SEQUENCE LISTING
(1) GENERAL INFORMATION 
(i) APPLICANT: Therion, Corporation 

(ii) TITLE OF THE INVENTION: Recombinant Pox Virus For Immunization Against MUC1 Tumor-Associated Antigen

(iii) NUMBER OF SEQUENCES: 20 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Dike, Bronstein, Roberts & Cushman, LLP 

(B) STREET: 130 Water Street 

(C) CITY: Boston 

(D) STATE: MA 

(E) COUNTRY: USA 
(F) ZIP: 02109


(v) COMPUTER READABLE FORM: 
(A) MEDIUM TYPE: Diskette

(B) COMPUTER: IBM Compatible

(C) OPERATING SYSTEM: DOS 

(D) SOFTWARE: FastSEQ for Windows Version 2.0 

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: PCT/US98/03693 

(B) FILING DATE: 24-FEB-1998 
(C) CLASSIFICATION:






(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: 60/038,253 

(B) FILING DATE: 24-FEB-1997 



(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Eisenstein, Ronald I 

(B) REGISTRATION NUMBER: 30,628 

(C) REFERENCE/DOCKET NUMBER: 953/47113-pct

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 617-523-3400 

(B) TELEFAX: 617-523-6440 

(C) TELEX: 



(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

Gly Ser Thr Ala Pro Pro Ala His Gly Val Thr Ser Ala Pro Asp Thr

1 5 10 15 

Arg Pro Ala Pro 

20 
(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 60 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 
GGCTCCACCG CCCCCCCAGC CCACGGTGTC ACCTCGGCCC CGGACACCAG GCCGGCCCCG 60 
(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

Pro Asp Thr Arg Pro Ala Pro 
1 5 

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 60 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 
GGCAGTACTG CACCACCGGC ACATGGCGTA ACATCAGCAC CTGATACAAG ACCTGCACCT
(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 60 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5: 
GGATCCACCG CGCCGCCTGC GCACGGAGTG ACGTCGGCGC CCGACACGCG CCCCGCTCCC
(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 60 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
GGGTCAACAG CTCCTCCCGC TCATGGGGTT ACTTCTGCTC CAGATACTCG CCCAGCTCCA 
(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 60 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 
GGTTCGACGG CCCCCCCTGC TCACGGTGTA ACATCCGCCC CGGATACCAG ACCGGCCCCT 
(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 60 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 
GGCAGCACCG CACCGCCCGC ACACGGGGTC ACAAGCGCGC CAGACACTCG ACCTGCGCCA 
(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 60 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 
GGAAGTACCG CTCCACCTGC ACACGGGGTC ACAAGCGCGC CAGACACTCG ACCTGCGCCA 
(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 60 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 



GGGTCGACTG CCCCTCCGGC GCATGGTGTG ACCTCAGCTC CTGACACAAG GCCAGCCCCA 






(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 60 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:
GGTTCAACGG CACCTCCAGC ACACGGAGTC ACGTCTGCAC CCGACACCCG TCCAGCTCCG 
(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 60 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 
GGTAGTACAG CGCCACCCGC ACATGGCGTC ACGAGCGCTC CGGATACGAG ACCGGCGCCT 
(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 78 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

GGCTCCACCG CACCCCCAGC CCACGGTGTC ACCTCGGCCC CGGACACCAG GCGGGCCCCG 60 
GGCTCCACCC CGGCCCCG 78 

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 60 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 
GGCTCCACCG CCCCCCCAGC CCATGGTGTC ACCTCGGCCC CGGACAACAG GCCCGCCTTG 
(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:
GGCTCCACCG CCCCTCCAGT CCACAATGTC ACCTCGGCC 
(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

Gly Ser Thr Ala Pro Pro Ala His Gly Val Thr Ser Ala Pro Asp Thr 

1 5 10 15 

Arg Arg Ala Pro 

20 

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

Gly Ser Thr Ala Pro Pro Ala His Gly Val Thr Ser Ala Pro Asp Asn 

1 5 10 15

Arg Pro Ala Leu 
20 

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 13 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

Gly Ser Thr Ala Pro Pro Val His Asn Val Thr Ser Ala 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1527 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ix) FEATURE:

(A) NAME/KEY: Coding Sequence

(B) LOCATION: 1...1524
(D) OTHER INFORMATION: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

ATG ACA CCG GGC ACC CAG TCT CCT TTC TTC CTG CTG CTG CTC CTC ACA 48
Met Thr Pro Gly Thr Gln Ser Pro Phe Phe Leu Leu Leu Leu Leu Thr
1 5 10 15

GTG CTT ACA GCT ACC ACA GCC CCT AAA CCC GCA ACA GTT GTT ACG GGT 96 
Val Leu Thr Ala Thr Thr Ala Pro Lys Pro Ala Thr Val Val Thr Gly 
20 25 30 

TCT GGT CAT GCA AGC TCT ACC CCA GGT GGA GAA AAG GAG ACT TCG GCT 144
Ser Gly His Ala Ser Ser Thr Pro Gly Gly Glu Lys Glu Thr Ser Ala
35 40 45

ACC CAG AGA AGT TCA GTG CCC AGC TCT ACT GAG AAG AAT GCT GTG AGT 192
Thr Gln Arg Ser Ser Val Pro Ser Ser Thr Glu Lys Asn Ala Val Ser
50 55 60



ATG ACA AGC TTG ATA TCG AAT TCC GGT GTC CGG GGC TCC ACC GCC CCC 240
Met Thr Ser Leu Ile Ser Asn Ser Gly Val Arg Gly Ser Thr Ala Pro
65 70 75 80

CCA GCC CAC GGT GTC ACC TCG GCC CCG GAC ACC AGG CCG GCC CCG GGC 288
Pro Ala His Gly Val Thr Ser Ala Pro Asp Thr Arg Pro Ala Pro Gly
85 90 95

TCC ACC GCC CCC CCA GCC CAC GGT GTC ACC TCG GCC CCG GAC ACC AGG 336
Ser Thr Ala Pro Pro Ala His Gly Val Thr Ser Ala Pro Asp Thr Arg
100 105 110

CCG GCC CCG GGC TCC ACC GCC CCC CCA GCC CAC GGT GTC ACC TCG GCC 384
Pro Ala Pro Gly Ser Thr Ala Pro Pro Ala His Gly Val Thr Ser Ala
115 120 125

CCG GAC ACC AGG CCG GCC CCG GGC TCC ACC GCA CCC CCA GCC CAC GGT 432
Pro Asp Thr Arg Pro Ala Pro Gly Ser Thr Ala Pro Pro Ala His Gly
130 135 140

GTC ACC TCG GCC CCG GAC ACC AGG CGG GCC CCG GGC TCC ACC CCG GCC 480 
Val Thr Ser Ala Pro Asp Thr Arg Arg Ala Pro Gly Ser Thr Pro Ala 
145 150 155 160 

CCG GGC TCC ACC GCC CCC CCA GCC CAC GGT GTC ACC TCG GCC CCG GAC 528 
Pro Gly Ser Thr Ala Pro Pro Ala His Gly Val Thr Ser Ala Pro Asp 

165 170 175 

ACC AGG CCG GCC CCG GGC TCC ACC GCC CCC CCA GCC CAT GGT GTC ACC 576
Thr Arg Pro Ala Pro Gly Ser Thr Ala Pro Pro Ala His Gly Val Thr
180 185 190

TCG GCC CCG GAC AAC AGG CCC GCC TTG GGC TCC ACC GCC CCT CCA GTC 624
Ser Ala Pro Asp Asn Arg Pro Ala Leu Gly Ser Thr Ala Pro Pro Val
195 200 205

CAC AAT GTC ACC TCG GCC TCA GGC TCT GCA TCA GGC TCA GCT TCT ACT 672
His Asn Val Thr Ser Ala Ser Gly Ser Ala Ser Gly Ser Ala Ser Thr
210 215 220

CTG GTG CAC AAC GGC ACC TCT GCC AGG GCT ACC ACA ACC CCA GCC AGC 720 
Leu Val His Asn Gly Thr Ser Ala Arg Ala Thr Thr Thr Pro Ala Ser 
225 230 235 240 

AAG AGC ACT CCA TTC TCA ATT CCC AGC CAC CAC TCT GAT ACT CCT ACC 768
Lys Ser Thr Pro Phe Ser Ile Pro Ser His His Ser Asp Thr Pro Thr
245 250 255

ACC CTT GCC AGC CAT AGC ACC AAG ACT GAT GCC AGT AGC ACT CAC CAT 816 
Thr Leu Ala Ser His Ser Thr Lys Thr Asp Ala Ser Ser Thr His His 
260 265 270 

AGC ACG GTA CCT CCT CTC ACC TCC TCC AAT CAC AGC ACT TCT CCC CAG 864
Ser Thr Val Pro Pro Leu Thr Ser Ser Asn His Ser Thr Ser Pro Gln
275 280 285

TTG TCT ACT GGG GTC TCT TTC TTT TTC CTG TCT TTT CAC ATT TCA AAC 912
Leu Ser Thr Gly Val Ser Phe Phe Phe Leu Ser Phe His Ile Ser Asn
290 295 300

CTC CAG TTT CCT TCC TCT CTC GAA GAT CCC AGC ACC GAC TAC TAC CAA 960
Leu Gln Phe Pro Ser Ser Leu Glu Asp Pro Ser Thr Asp Tyr Tyr Gln
305 310 315 320

GAG CTG CAG AGA GAC ATT TCT CAA ATG TTT TTG CAG ATT TAT AAA CAA 1008
Glu Leu Gln Arg Asp Ile Ser Gln Met Phe Leu Gln Ile Tyr Lys Gln
325 330 335



GGG GGT TTT CTG GGC CTC TCC AAT ATT AAG TTC AGG CCA GGA TCT GTG 1056
Gly Gly Phe Leu Gly Leu Ser Asn Ile Lys Phe Arg Pro Gly Ser Val
340 345 350

CTG GTA CAA TTG ACT CTG GCC TTC CGA GAA GGT ACC ATC AAT GTC CAC 1104
Leu Val Gln Leu Thr Leu Ala Phe Arg Glu Gly Thr Ile Asn Val His
355 360 365

GAC GTG GAG ACA CAG TTC AAT CAG TAT AAA ACG GAA GCA GCC TCT CGA 1152
Asp Val Glu Thr Gln Phe Asn Gln Tyr Lys Thr Glu Ala Ala Ser Arg
370 375 380

TAT AAC CTG ACG ATC CCA GAC GTC AGC GTG AGT GAT GTG CCA TTT CCT 1200
Tyr Asn Leu Thr Ile Pro Asp Val Ser Val Ser Asp Val Pro Phe Pro
385 390 395 400

TTC TCT GCC CAG TCT GGG GCT GGG GTG CCA GGC TGG GGC ATC GCG CTG 1248
Phe Ser Ala Gln Ser Gly Ala Gly Val Pro Gly Trp Gly Ile Ala Leu
405 410 415

CTC CTG CTG GTC TGT GTT CTG GTT GCG CTG GCC ATT GTC TAT CTC ATT 1296
Leu Leu Leu Val Cys Val Leu Val Ala Leu Ala Ile Val Tyr Leu Ile
420 425 430

GCC TTG GCT GTC TGT CAG TGC CGC CGA AAG AAC TAC GGG CAG CTG GAC 1344
Ala Leu Ala Val Cys Gln Cys Arg Arg Lys Asn Tyr Gly Gln Leu Asp
435 440 445
ATC TTT CCA GCC CGG GAT ACC TAC CAT CCT ATG AGC GAG TAC CCC ACC 1392
Ile Phe Pro Ala Arg Asp Thr Tyr His Pro Met Ser Glu Tyr Pro Thr
450 455 460

TAC CAC ACC CAT GGG CGC TAT GTC CCC CCT AGC AGT ACC GAT CGT AGC 1440
Tyr His Thr His Gly Arg Tyr Val Pro Pro Ser Ser Thr Asp Arg Ser
465 470 475 480

CCC TAT GAG AAG GTT TCT GCA GGT AAT GGT GGC AGC AGC CTC TCT TAC 1488
Pro Tyr Glu Lys Val Ser Ala Gly Asn Gly Gly Ser Ser Leu Ser Tyr
485 490 495

ACA AAC CCA GCA GTG GCA GCC ACT TCT GCC AAC TTG TAG 1527
Thr Asn Pro Ala Val Ala Ala Thr Ser Ala Asn Leu
500 505



(2) INFORMATION FOR SEQ ID NO: 20: 



(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 508 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(v) FRAGMENT TYPE: internal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

Met Thr Pro Gly Thr Gln Ser Pro Phe Phe Leu Leu Leu Leu Leu Thr

1 5 10 15

Val Leu Thr Ala Thr Thr Ala Pro Lys Pro Ala Thr Val Val Thr Gly 

20 25 30 

Ser Gly His Ala Ser Ser Thr Pro Gly Gly Glu Lys Glu Thr Ser Ala 

35 40 45 

Thr Gln Arg Ser Ser Val Pro Ser Ser Thr Glu Lys Asn Ala Val Ser

50 55 60 

Met Thr Ser Leu Ile Ser Asn Ser Gly Val Arg Gly Ser Thr Ala Pro
65 70 75 80 

Pro Ala His Gly Val Thr Ser Ala Pro Asp Thr Arg Pro Ala Pro Gly 

85 90 95 

Ser Thr Ala Pro Pro Ala His Gly Val Thr Ser Ala Pro Asp Thr Arg 

100 105 110

Pro Ala Pro Gly Ser Thr Ala Pro Pro Ala His Gly Val Thr Ser Ala 

115 120 125 

Pro Asp Thr Arg Pro Ala Pro Gly Ser Thr Ala Pro Pro Ala His Gly

130 135 140

Val Thr Ser Ala Pro Asp Thr Arg Arg Ala Pro Gly Ser Thr Pro Ala 
145 150 155 160 

Pro Gly Ser Thr Ala Pro Pro Ala His Gly Val Thr Ser Ala Pro Asp 

165 170 175

Thr Arg Pro Ala Pro Gly Ser Thr Ala Pro Pro Ala His Gly Val Thr

180 185 190

Ser Ala Pro Asp Asn Arg Pro Ala Leu Gly Ser Thr Ala Pro Pro Val 

195 200 205 

His Asn Val Thr Ser Ala Ser Gly Ser Ala Ser Gly Ser Ala Ser Thr 

210 215 220 

Leu Val His Asn Gly Thr Ser Ala Arg Ala Thr Thr Thr Pro Ala Ser 
225 230 235 240 

Lys Ser Thr Pro Phe Ser Ile Pro Ser His His Ser Asp Thr Pro Thr